| Name | Admin No | Class | Assignment |
|---|---|---|---|
| Goh Rui Zhuo | 2222329 | DAAA/2B/05 | Deep Learning CA1 |
The goal of this part is to implement an image classifier using a deep learning network:
Build two models and compare their accuracies.
What is image classification? Image classification plays an irreplaceable role in modern technology. It involves assigning a label or tag to an entire image based on pre-existing training data of already-labelled images.
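As a minimal sketch of that idea (the class names and scores below are made up for illustration), a trained classifier maps an image to per-class probabilities, and the predicted label is simply the argmax:

```python
import numpy as np

# Hypothetical softmax output for one image over three illustrative classes
class_names = ['Bean', 'Carrot', 'Tomato']
scores = np.array([0.1, 0.7, 0.2])

# The predicted label is the class with the highest probability
predicted = class_names[int(np.argmax(scores))]
print(predicted)  # -> Carrot
```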
!pip install tensorflow_addons keras-tuner pandas matplotlib seaborn scikit-learn tqdm
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
from sklearn.decomposition import PCA
from sklearn.preprocessing import Normalizer
import os, time, math, datetime, warnings, pytz, glob
from IPython.display import display
from functools import reduce
import absl.logging
from tqdm import tqdm
import logging
absl.logging.set_verbosity(absl.logging.ERROR)
logging.getLogger('tensorflow').disabled = True
warnings.filterwarnings('ignore')
import tensorflow as tf
from tensorflow.keras.utils import Sequence, to_categorical
from tensorflow import expand_dims
from tensorflow.keras import Sequential
from tensorflow.keras import layers as L
from tensorflow.keras import backend as K
from tensorflow.image import random_flip_left_right, random_crop, resize_with_crop_or_pad
from tensorflow.keras.utils import plot_model
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.layers import (Dense, Input, InputLayer, Normalization, Flatten, BatchNormalization,
    Dropout, Conv2D, GlobalAveragePooling2D, MaxPooling2D, ReLU, Layer, Activation, Multiply, AveragePooling2D,
    Add, RandomRotation, Resizing, Rescaling, Reshape, Concatenate, concatenate, Lambda, LeakyReLU)
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, LearningRateScheduler, ReduceLROnPlateau, TerminateOnNaN, TensorBoard, CSVLogger, Callback
from tensorflow.keras.backend import clear_session
from tensorflow.keras.optimizers import RMSprop, SGD, Adam, Adagrad, Adamax
from tensorflow.keras.regularizers import l2, L2
from tensorflow.keras.optimizers.schedules import *
from tensorflow.keras.metrics import FalseNegatives, categorical_crossentropy, sparse_categorical_crossentropy
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.image import *
from tensorflow_addons.optimizers import SWA
from keras_tuner.tuners import Hyperband
from keras_tuner import HyperModel
# Setting a seaborn style
sns.set(style="whitegrid")
seed = 32
tf.random.set_seed(seed)
np.random.seed(seed)
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(f"{len(gpus)} Physical GPUs, {len(logical_gpus)} Logical GPU")
    except RuntimeError as e:
        print(e)
1 Physical GPUs, 1 Logical GPU
!nvidia-smi
Sun Nov 12 09:32:37 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3090 On | 00000000:01:00.0 Off | N/A |
| 0% 28C P8 27W / 305W | 1296MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
data = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/train',
    color_mode='rgb',
    image_size=(224, 224))
data
Found 9028 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
val_data = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/validation',
    color_mode='rgb',
    image_size=(224, 224))
val_data
Found 3000 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
test_data = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/test',
    color_mode='rgb',
    image_size=(224, 224))
test_data
Found 3000 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
X_train = []
y_train = []
# Collect every image batch into the arrays
for images, labels in tqdm(data):
    X_train.append(images)
    y_train.append(labels)
X_train = np.concatenate(X_train, axis=0)
y_train = np.concatenate(y_train, axis=0)
100%|██████████| 283/283 [00:01<00:00, 152.02it/s]
(X_train.shape, y_train.shape)
((9028, 224, 224, 3), (9028,))
X_val = []
y_val = []
# Collect every image batch into the arrays
for images, labels in tqdm(val_data):
    X_val.append(images)
    y_val.append(labels)
X_val = np.concatenate(X_val, axis=0)
y_val = np.concatenate(y_val, axis=0)
100%|██████████| 94/94 [00:00<00:00, 176.33it/s]
(X_val.shape, y_val.shape)
((3000, 224, 224, 3), (3000,))
X_test = []
y_test = []
# Collect every image batch into the arrays
for images, labels in tqdm(test_data):
    X_test.append(images)
    y_test.append(labels)
X_test = np.concatenate(X_test, axis=0)
y_test = np.concatenate(y_test, axis=0)
100%|██████████| 94/94 [00:00<00:00, 152.55it/s]
(X_test.shape, y_test.shape)
((3000, 224, 224, 3), (3000,))
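The three conversion loops above are identical apart from the variable names; they could be folded into one helper (a sketch, assuming any iterable of `(images, labels)` batches such as the `tf.data` datasets here):

```python
import numpy as np

def dataset_to_arrays(dataset):
    # Collect every (images, labels) batch into two NumPy arrays
    images_list, labels_list = [], []
    for images, labels in dataset:
        images_list.append(np.asarray(images))
        labels_list.append(np.asarray(labels))
    return np.concatenate(images_list, axis=0), np.concatenate(labels_list, axis=0)

# Usage would then be e.g.: X_train, y_train = dataset_to_arrays(data)
```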
Here we have imported the dataset and converted it into NumPy arrays, and proceed to do analysis on it.
This is so that we are able to use it for EDA and model training later on.
# Sort the class names so they match image_dataset_from_directory's label order
labels_dict = sorted(os.listdir('Dataset for CA1 part A/train'))
labels_dict = {idx: label for idx, label in enumerate(labels_dict)}
print(labels_dict)
{0: 'Bean', 1: 'Bitter_Gourd', 2: 'Bottle_Gourd', 3: 'Brinjal', 4: 'Broccoli', 5: 'Cabbage', 6: 'Capsicum', 7: 'Carrot', 8: 'Cauliflower', 9: 'Cucumber', 10: 'Papaya', 11: 'Potato', 12: 'Pumpkin', 13: 'Radish', 14: 'Tomato'}
The idea here is to identify class imbalance in the dataset, as an imbalanced dataset may cause poor performance on under-represented classes, impacting overall performance.
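The counting logic can be sketched on plain label lists with `collections.Counter` (the labels below are illustrative, not the actual dataset):

```python
from collections import Counter

labels = [0, 0, 1, 2, 2, 2]  # toy stand-in for the dataset's integer labels
counts = Counter(labels)
total = len(labels)
for label, count in sorted(counts.items()):
    print(f"Label {label} contains: {count} samples, {count / total * 100:.1f}%")
```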
def get_classes_distribution(data):
    label_counts = {}
    total_samples = 0
    # Loop through the dataset
    for batch in tqdm(data):
        labels = batch[1]
        # Split into the unique labels and their counts
        # (tf.unique returns indices, not counts, so unique_with_counts is needed)
        unique_labels, _, counts = tf.unique_with_counts(labels)
        for label, count in zip(unique_labels.numpy(), counts.numpy()):
            # Check whether the label is already in the counts dictionary
            if label not in label_counts:
                label_counts[label] = count
            else:
                label_counts[label] += count
        total_samples += len(labels)
    label_counts = {vegetable: label_counts[index] for index, vegetable in labels_dict.items()}
    # Print the percentage for each label
    for label, count in tqdm(label_counts.items()):
        percent = (count / total_samples) * 100
        print("Label {} contains: {} samples, {}%".format(label, count, percent))
    return label_counts
label_count = get_classes_distribution(data)
100%|██████████| 283/283 [00:01<00:00, 180.36it/s] 100%|██████████| 15/15 [00:00<00:00, 42452.47it/s]
Label Bean contains: 858 samples, 9.50376606114311% Label Bitter_Gourd contains: 917 samples, 10.15728843597696% Label Bottle_Gourd contains: 797 samples, 8.828090385467435% Label Brinjal contains: 786 samples, 8.706247230837395% Label Broccoli contains: 846 samples, 9.370846256092157% Label Cabbage contains: 899 samples, 9.95790872840053% Label Capsicum contains: 805 samples, 8.916703588834736% Label Carrot contains: 788 samples, 8.72840053167922% Label Cauliflower contains: 777 samples, 8.60655737704918% Label Cucumber contains: 968 samples, 10.72219760744351% Label Papaya contains: 912 samples, 10.101905183872397% Label Potato contains: 834 samples, 9.237926451041204% Label Pumpkin contains: 900 samples, 9.968985378821445% Label Radish contains: 647 samples, 7.166592822330527% Label Tomato contains: 803 samples, 8.894550287992912%
def plot_dist(count):
    # Unpack the vegetable names and counts
    labels, counts = zip(*count.items())
    fig, ax = plt.subplots(1, 1, figsize=(10, 8))
    fig.suptitle('Vegetable Class Labels Visualization', fontsize=16, fontweight='bold')
    bars = ax.barh(labels, counts, color=sns.color_palette("viridis", len(labels)))
    # Set the axis labels
    ax.set_xlabel('Total Count', fontsize=12)
    ax.set_ylabel('Vegetable Names', fontsize=12)
    # Annotate each bar with its count
    for bar in bars:
        width = bar.get_width()
        label_x_pos = width + max(counts) * 0.01
        ax.text(label_x_pos, bar.get_y() + bar.get_height()/2, f'{width}', va='center')
    plt.show()
plot_dist(label_count)
Process of picking random images and visualising them
def random_image_visualization(X_data, y_data, labels_dict, num_images=10, new=False, size=None):
    # Pick random indices without replacement
    random_indices = np.random.choice(X_data.shape[0], num_images, replace=False)
    # Calculate the number of rows and columns of the grid
    rows = num_images // 5 + int(num_images % 5 != 0)
    cols = min(num_images, 5)
    fig, axes = plt.subplots(rows, cols, figsize=(15, 3*rows))
    axes = axes.flatten() if num_images > 1 else [axes]
    if new:
        fig.suptitle(f'Random Image Visualisation {size}', fontsize=16, fontweight='bold')
    else:
        fig.suptitle('Random Image Visualisation', fontsize=16, fontweight='bold')
    for idx, ax in enumerate(axes):
        if idx < num_images:
            image = X_data[random_indices[idx]]
            # Grayscale data is shown with a gray colormap, RGB data as uint8
            if new:
                ax.imshow(image, cmap='gray')
            else:
                ax.imshow(image.astype('uint8'))
            ax.set_title(f"Label: {labels_dict[y_data[random_indices[idx]]]}", fontsize=12)
        ax.axis('off')
    plt.show()
random_image_visualization(X_train, y_train, labels_dict, new=False)
Similar to the random visualisation above, this visualises the first 35 images of the dataset in order
def visualize_first_images(X_data, y_data, labels, num_images=35, new=False, size=None):
    num_rows = num_images // 7 + (num_images % 7 != 0)
    fig = plt.figure(figsize=(15, 2 * num_rows))
    if new:
        fig.suptitle(f'First 35 Images {size}', fontsize=16, fontweight='bold')
    else:
        fig.suptitle('First 35 Images', fontsize=16, fontweight='bold')
    for i in range(num_images):
        ax = fig.add_subplot(num_rows, 7, i+1)
        # Grayscale data is shown with a gray colormap, RGB data as uint8
        if new:
            ax.imshow(X_data[i], cmap='gray')
        else:
            ax.imshow(X_data[i].astype('uint8'))
        ax.set_title(labels[int(y_data[i])], fontsize=12)
        ax.axis('off')
    plt.show()
visualize_first_images(X_train, y_train, labels_dict)
This helps determine which method to use for scaling the dataset
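Two common options are min-max scaling (dividing 8-bit pixels by 255) and standardisation with the mean and standard deviation; the pixel histograms help pick between them. A toy sketch of min-max scaling (the array is a stand-in, not the actual data):

```python
import numpy as np

# Toy stand-in for 8-bit pixel values; real code would scale X_train
X = np.array([0.0, 63.75, 127.5, 255.0])
X_minmax = X / 255.0  # maps pixel values into [0, 1]
print(X_minmax)
```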
def visualise_pixel_distribution(X_data, index1, index2):
    fig, ax = plt.subplots(2, 2, figsize=(15, 8))
    fig.suptitle('Pixel Distribution', fontsize=15, fontweight='bold')
    ax[0, 0].imshow(X_data[index1].astype(np.uint8))  # First image
    ax[0, 0].axis('off')
    sns.histplot(X_data[index1].flatten(), ax=ax[1, 0], kde=True, color='blue')
    ax[0, 1].imshow(X_data[index2].astype(np.uint8))  # Second image
    ax[0, 1].axis('off')
    sns.histplot(X_data[index2].flatten(), ax=ax[1, 1], kde=True, color='green')
    plt.show()
visualise_pixel_distribution(X_train, index1=0, index2=1)
Checking the mean pixel value and standard deviation
mean, std = np.mean(X_train), np.std(X_train)
print('Mean of Images:', mean)
print('Standard Deviation of Images:', std)
Mean of Images: 106.97684 Standard Deviation of Images: 106.97684
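These statistics would be used for standardisation, i.e. subtracting the mean and dividing by the standard deviation so the pixels have zero mean and unit variance (sketched here on a toy array rather than `X_train`):

```python
import numpy as np

X = np.array([[0.0, 50.0], [100.0, 250.0]])  # toy stand-in for an image batch
mean, std = X.mean(), X.std()
X_standardised = (X - mean) / std
print(X_standardised.mean(), X_standardised.std())  # ~0 and ~1
```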
def average_img(X_data):
    average_image = np.mean(X_data.astype(np.uint8), axis=0) / 255
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.imshow(average_image)
    ax.set_title('Average Image', fontsize=15, fontweight='bold')
    ax.axis('off')
    plt.show()
average_img(X_train)
def mean_class(X_data, y_data):
    # 3 x 5 grid so that all 15 classes are shown
    fig, ax = plt.subplots(3, 5, figsize=(13, 12))
    fig.suptitle('Mean Pixel By Class', fontsize=15, fontweight='bold')
    for index, axs in enumerate(ax.ravel()):
        average = np.mean(X_data[y_data == index], axis=0)
        axs.imshow(average.astype(np.uint8))
        axs.set_title(f'Label: {labels_dict[index]}')
        axs.axis('off')
    plt.show()
mean_class(X_train, y_train)
Here we import the dataset again at a reduced 31 x 31 resolution (converted to grayscale) and repeat the analysis
data_small = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/train',
    color_mode='rgb',
    image_size=(31, 31))
data_small
Found 9028 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 31, 31, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
X_train_small = []
y_train_small = []
for images, labels in tqdm(data_small):
    # Convert the RGB batches to single-channel grayscale
    images = tf.image.rgb_to_grayscale(images)
    X_train_small.append(images)
    y_train_small.append(labels)
X_train_small = np.concatenate(X_train_small, axis=0)
X_train_small = np.squeeze(X_train_small, axis=-1)
y_train_small = np.concatenate(y_train_small, axis=0)
100%|██████████| 283/283 [00:01<00:00, 179.66it/s]
val_data_small = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/validation',
    color_mode='rgb',
    image_size=(31, 31))
val_data_small
Found 3000 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 31, 31, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
X_val_small = []
y_val_small = []
for images, labels in tqdm(val_data_small):
    # Convert the RGB batches to single-channel grayscale
    images = tf.image.rgb_to_grayscale(images)
    X_val_small.append(images)
    y_val_small.append(labels)
X_val_small = np.concatenate(X_val_small, axis=0)
X_val_small = np.squeeze(X_val_small, axis=-1)
y_val_small = np.concatenate(y_val_small, axis=0)
100%|██████████| 94/94 [00:00<00:00, 239.20it/s]
test_data_small = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/test',
    color_mode='rgb',
    image_size=(31, 31))
test_data_small
Found 3000 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 31, 31, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
X_test_small = []
y_test_small = []
for images, labels in tqdm(test_data_small):
    # Convert the RGB batches to single-channel grayscale
    images = tf.image.rgb_to_grayscale(images)
    X_test_small.append(images)
    y_test_small.append(labels)
X_test_small = np.concatenate(X_test_small, axis=0)
X_test_small = np.squeeze(X_test_small, axis=-1)
y_test_small = np.concatenate(y_test_small, axis=0)
100%|██████████| 94/94 [00:00<00:00, 260.74it/s]
random_image_visualization(X_train_small, y_train_small, labels_dict, new=True, size='(31 x 31 Images)')
visualize_first_images(X_train_small, y_train_small, labels_dict, new=True, size='(31 x 31 Images)')
def visualise_pixel_distribution2(X_data, index1, index2):
    fig, ax = plt.subplots(2, 2, figsize=(15, 8))
    fig.suptitle('Pixel Distribution', fontsize=15, fontweight='bold')
    ax[0, 0].imshow(X_data[index1], cmap='gray')  # First image
    ax[0, 0].axis('off')
    sns.histplot(X_data[index1].flatten(), ax=ax[1, 0], kde=True, color='blue')
    ax[0, 1].imshow(X_data[index2], cmap='gray')  # Second image
    ax[0, 1].axis('off')
    sns.histplot(X_data[index2].flatten(), ax=ax[1, 1], kde=True, color='green')
    plt.show()
visualise_pixel_distribution2(X_train_small, index1=0, index2=1)
Checking the mean pixel value and standard deviation
mean, std = np.mean(X_train_small), np.std(X_train_small)
print('Mean of Images:', mean)
print('Standard Deviation of Images:', std)
Mean of Images: 114.36305 Standard Deviation of Images: 114.36305
def average_img(X_data):
    average_image = np.mean(X_data.astype(np.uint8), axis=0) / 255
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.imshow(average_image, cmap='gray')
    ax.set_title('Average Image', fontsize=15, fontweight='bold')
    ax.axis('off')
    plt.show()
average_img(X_train_small)
def mean_class(X_data, y_data):
    # 3 x 5 grid so that all 15 classes are shown
    fig, ax = plt.subplots(3, 5, figsize=(13, 12))
    fig.suptitle('Mean Pixel By Class', fontsize=15, fontweight='bold')
    for index, axs in enumerate(ax.ravel()):
        average = np.mean(X_data[y_data == index], axis=0)
        axs.imshow(average, cmap='gray')
        axs.set_title(f'Label: {labels_dict[index]}')
        axs.axis('off')
    plt.show()
mean_class(X_train_small, y_train_small)
Here we import the dataset again at a 128 x 128 resolution (converted to grayscale) and repeat the analysis
data_big = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/train',
    color_mode='rgb',
    image_size=(128, 128))
data_big
Found 9028 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
X_train_big = []
y_train_big = []
for images, labels in tqdm(data_big):
    # Convert the RGB batches to single-channel grayscale
    images = tf.image.rgb_to_grayscale(images)
    X_train_big.append(images)
    y_train_big.append(labels)
X_train_big = np.concatenate(X_train_big, axis=0)
X_train_big = np.squeeze(X_train_big, axis=-1)
y_train_big = np.concatenate(y_train_big, axis=0)
100%|██████████| 283/283 [00:01<00:00, 211.25it/s]
val_data_big = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/validation',
    color_mode='rgb',
    image_size=(128, 128))
val_data_big
Found 3000 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
X_val_big = []
y_val_big = []
for images, labels in tqdm(val_data_big):
    # Convert the RGB batches to single-channel grayscale
    images = tf.image.rgb_to_grayscale(images)
    X_val_big.append(images)
    y_val_big.append(labels)
X_val_big = np.concatenate(X_val_big, axis=0)
X_val_big = np.squeeze(X_val_big, axis=-1)
y_val_big = np.concatenate(y_val_big, axis=0)
100%|██████████| 94/94 [00:00<00:00, 187.41it/s]
test_data_big = tf.keras.utils.image_dataset_from_directory('Dataset for CA1 part A/test',
    color_mode='rgb',
    image_size=(128, 128))
test_data_big
Found 3000 files belonging to 15 classes.
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>
X_test_big = []
y_test_big = []
for images, labels in tqdm(test_data_big):
    # Convert the RGB batches to single-channel grayscale
    images = tf.image.rgb_to_grayscale(images)
    X_test_big.append(images)
    y_test_big.append(labels)
X_test_big = np.concatenate(X_test_big, axis=0)
X_test_big = np.squeeze(X_test_big, axis=-1)
y_test_big = np.concatenate(y_test_big, axis=0)
100%|██████████| 94/94 [00:00<00:00, 185.17it/s]
random_image_visualization(X_train_big, y_train_big, labels_dict, new=True, size='(128 x 128 Images)')
visualize_first_images(X_train_big, y_train_big, labels_dict, new=True, size='(128 x 128 Images)')
visualise_pixel_distribution2(X_train_big, index1=0, index2=1)
The exploratory data analysis above was done on the original 224 x 224 images as well as the 31 x 31 and 128 x 128 grayscale versions; next, we can start developing our models.